Previous genome-wide association studies (GWASs) identified multiple
susceptibility loci that have highlighted the important role of TLR
(Toll-like receptor) and CARD (caspase recruitment domain) genes in
leprosy. A large three-stage candidate gene-based association study of
30 TLR and 47 CARD genes was performed in the leprosy samples of
Chinese Han. Of 4363 SNPs investigated, eight SNPs showed suggestive
association (P < 0.01) in our previously published GWAS datasets
(Stage 1). Of the eight SNPs, rs2735591 and rs4889841 showed
significant association (P < 0.001) in an independent series of 1504
cases and 1502 controls (Stage 2), but only rs2735591 (next to BCL10)
showed significant association in the second independent series of 938
cases and 5827 controls (Stage 3). Rs2735591 showed consistent
association across the three stages (P > 0.05 for heterogeneity test),
significant association in the combined validation samples
(P_corrected = 5.54 × 10⁻⁴ after correction for 4363 SNPs tested) and
genome-wide significance in the whole GWAS and validation samples
(P = 1.03 × 10⁻⁹, OR = 1.24). In addition, we demonstrated the lower
expression of BCL10 in leprosy lesions than normal skins and a
significant gene connection between BCL10 and the eight previously
identified leprosy loci that are associated with NF-κB, a major
regulator of downstream inflammatory responses, which provides further
biological evidence for the association. We have discovered a novel
susceptibility locus on 1p22, which implicates BCL10 as a new
susceptibility gene for leprosy. Our finding highlights the important
role of both innate and adaptive immune responses in leprosy.



INTRODUCTION

Leprosy is a chronic human infectious disease caused by the
intracellular pathogen Mycobacterium leprae (M. leprae), which
remains a serious health problem in the developing countries, with
more than 200 000 new cases being detected each year (1). It affects
both the skin and peripheral nerves and can cause irreversible
impairment of nerve function and chronic disabilities (2). It has been
proven that both environmental factors and host genetics play a
critical role in the prevalence of leprosy with estimated hereditable
fractions of up to 57% (3,4).

Clinical manifestations of leprosy are classified according to 
World Health Organization (paucibacillary and multibacillary) 
schemes, reflective of a Th1 (cell-mediated) or Th2 (humoral)
host immune response, respectively. Its divergent clinical
forms reflect two distinct immune responses to the same pathogen,
which underscores the importance of host genetics in susceptibility
to leprosy. The results of genetic studies suggest the
involvement of host genetics in both susceptibility to leprosy 
per se and development of clinical subtypes (5). 

In 2009, we performed the first GWAS of leprosy and identified six
susceptibility loci (CCDC122, LACC1, NOD2, TNFSF15, HLA-DR and
RIPK2) and one suggestive locus (LRRK2) in the Chinese Han
population, which indicated the importance of NOD2-mediated innate
immunity against infection by M. leprae (6). NOD2 is a member of the
CARD-containing protein family (CARD15), which plays an important
role in the immune response by recognizing bacterial molecules and
activating the NF-κB protein (7-10). We also found association around
RIPK2, which also has a CARD domain in its C-terminal portion that
can interact with NOD2 and induce activation of NF-κB (11). Other
CARD-carrying proteins, including CARD9, CARD10, CARD11, CARD14,
NOD1 and BCL10, have also been shown to facilitate NF-κB activation
(12). Our subsequent GWAS further identified IL23R and RAB32 as
susceptibility loci for leprosy (13), which not only expanded the
biological functions of IL23R by uncovering its involvement in
infectious disease susceptibility, but also suggested a potential
involvement of autophagocytosis (RAB32) in leprosy development.

Toll-like receptors (TLRs) recognize particular molecular pat- 
terns of diverse microorganisms to induce innate immune 
responses. The contribution of the TLR variations to the suscep- 
tibility for leprosy has been investigated in different populations, 
and TLR1, TLR2 and TLR4 variants have been reported to be 
associated with leprosy by previous studies in Indian, Nepalese 
and African populations (14-17). With the exception of TLR1,
the association evidence for other TLR variants is, however,
not conclusive.

Although we did not find definitive evidence of TOLL/CARD
enrichment through an unbiased gene-enrichment analysis of our
previously published GWAS dataset (13) using MAGENTA (18)
(Supplementary Material, Table S1), the evidence of
association around NOD2 and RIPK2 in our previous
findings and the roles of CARD-containing molecules, as well 
as TOLL receptors in the innate immune response against patho- 
gens, led us to hypothesize that other TOLL/CARD genes might 
also play a role in the susceptibility to leprosy. Therefore, to 
further investigate the role of TLRs and CARDs in leprosy, we 
performed a large three-stage candidate gene-based association 
study of TLRs and CARDs in leprosy samples of Chinese Han 
population. 



RESULTS 

The current candidate gene-based genetic association study was
performed in three stages using the previously published GWAS
datasets and two new independent sample series of the Chinese Han
population. The GWAS datasets include the matched 1220
samples (706 cases and 514 controls) used in our initial
GWAS analysis (6) and an additional 5067 controls
used in our subsequent expanded GWAS analysis (13). The
first independent validation series consists of 1504 cases and 
1502 controls, and the second independent validation series 
includes 938 cases and 5827 controls. The samples of the 
GWAS datasets were described in our previous publications 
(6,13). For the two independent validation series, the cases and 
the controls were matched in terms of age, gender and geograph- 
ic residence. The general characteristics of all the patients and 
control subjects are summarized in Table 1. In total, 3148 
leprosy cases and 12 910 controls of Chinese Han were investi- 
gated in the current study. 

In the current study, we performed a candidate gene-based
association study on 30 TLR- and 47 CARD-related genes that
were identified through the UCSC Genome Browser. We first
examined the association evidence within the critical regions 
of these candidate genes in the two published GWAS datasets 
of matched 1220 samples (6) and expanded 6287 samples (13). 
In total, 4363 SNPs within these 77 candidate genes were inves- 
tigated in the two GWAS datasets (Supplementary Material, 
Table S2). After excluding the SNPs located in known suscepti- 
bility loci, SNPs with suggestive evidence of association were 
identified using the following criteria: (i) the P-value of
association was < 1.0 × 10⁻² in the expanded GWAS analysis, and (ii)
the P-value of association showed improvement between the 
first GWAS analysis of 706 cases and 514 matched controls 
and the expanded GWAS analysis of 706 cases and 5581 popu- 
lation controls. A moderate threshold of suggestive association 
was employed here to maximize the power and the chance to dis- 
cover novel associations within these candidate genes. In total, 
eight suggestive SNPs within two TLR-related (ZCCHC11 and
FREM1) and six CARD-related (BCL10, KCNQ1, MAP2K1,
SGSH, NOD1 and VISA) candidate genes were brought
forward for further validation. 

In the first validation analysis (Stage 2), six of the eight 
selected SNPs were successfully genotyped in 1504 leprosy 
cases and 1502 healthy controls, while the other two SNPs 
failed in either the assay design (rs5743369) or genotyping ana- 
lysis (rs6084506). The association results of the six SNPs are 
summarized in the Supplementary Material, Table S3. Of the 
six SNPs, two showed significant association (P < 0.005
after correction for six SNPs tested): rs2735591 (P = 2.18 × 10⁻⁶,
OR = 1.32) and rs4889841 (P = 1.94 × 10⁻⁴, OR = 1.23). In the second
validation (Stage 3), the two SNPs were further genotyped in
additional 938 cases and 5827 controls. SNP rs2735591 showed
significant association in the second validation sample series
(P = 4.58 × 10⁻³, OR = 1.17), but rs4889841 failed to show consistent
association (P = 0.415, OR = 1.04) (Supplementary Material, Table S4).
Rs2735591 showed consistent association among the two independent
validation series and the previously published GWAS (1220 samples)
without evidence for genetic heterogeneity (P > 0.05) (Table 2). The
association at rs2735591 is highly significant in the combined
validation samples (P = 1.27 × 10⁻⁷, OR = 1.27), even after correction
for all the 4363 SNPs tested in the current study
(P_corrected = 5.54 × 10⁻⁴) and surpassed the genome-wide significance
threshold (P = 1.03 × 10⁻⁹, OR = 1.24) in the combined GWAS and two
independent validation series, consisting of a total of 3148 cases and
7843 controls (Table 2).

Table 1. Sample summary of 2442 cases and 8082 controls

                  Cases                                                   Controls
                  Sample size  Mean age  Mean age at onset  Males^a (%)   Sample size  Mean age  Males^a (%)
Validation set 1  1504         66.22     20.96              83%           1502         65.25     84%
Validation set 2  938          65.05     24.14              81%           5827         49.82     65%
Total             2442         65.78     22.2               82%           8082         52.42     68%

a Gender information was missing for 63 patients and 136 control subjects in all the samples.
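
The Bonferroni-corrected P-value quoted for rs2735591 can be checked with a few lines of arithmetic. The sketch below is only an illustration of that correction (multiply the raw P-value by the number of tests and cap the result at 1); the function name is ours, and the inputs are the two values stated in the text.

```python
def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni-adjusted P-value: raw P times number of tests, capped at 1."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if n_tests < 1:
        raise ValueError("n_tests must be a positive integer")
    return min(1.0, p_value * n_tests)


# Values quoted in the text for rs2735591 in the combined validation samples.
p_combined_validation = 1.27e-7
n_snps_tested = 4363

p_corrected = bonferroni(p_combined_validation, n_snps_tested)
print(f"{p_corrected:.2e}")  # 5.54e-04
```

The result agrees with the corrected value reported in the text.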

We further investigated the association evidence within the 
surrounding region of rs2735591 on 1p22 by imputing our
expanded GWAS dataset (13) using the genetic variation data 
from 1000 Genomes Project (version February 2012) as a refer- 
ence panel to maximize the genetic variation coverage of the 
region in association analysis. In total, we tested the association 
at 1500 imputed SNPs and 296 typed SNPs within this region. 
Rs2735591 is located within a linkage-disequilibrium (LD) 
block of 300 Kb on 1p22, where BCL10 as well as several
other genes or transcripts reside (Fig. 1). Within the LD block, 
there are several other SNPs showing consistent association evi- 
dence and in strong LD with rs2735591, but none of them are 
coding variants. The top SNP within this region was rs233100
(P = 3.73 × 10⁻⁴ and OR = 1.26), which is in LD with
rs2735591 (r² = 0.33, D′ = 0.90) (Fig. 1). Conditioning on
rs233100 could eliminate the association at rs2735591 (P =
0.389 and OR = 1.07), while conditioning on rs2735591
reduces the association at rs233100 (P = 0.015 and OR =
1.21) (Supplementary Material, Fig. S1), suggesting that the
associations at the two SNPs are not fully independent, and 
there seems to be one association within the region. 
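
The two LD measures quoted above can differ substantially for the same SNP pair: D′ stays high whenever one haplotype is nearly absent, while r² also depends on how similar the allele frequencies are. The following minimal sketch computes both from haplotype and allele frequencies using the standard definitions. The function name and the example frequencies are hypothetical and chosen only to illustrate a high-D′, modest-r² pair; they are not estimates from the study data.

```python
def ld_stats(p_ab: float, p_a: float, p_b: float) -> tuple[float, float]:
    """Return (D', r^2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A (SNP 1) and B (SNP 2)
    p_a, p_b: frequencies of allele A and allele B
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


# Hypothetical frequencies (not study data): a common allele and a rarer
# allele that almost always sits on the same haplotype.
d_prime, r2 = ld_stats(p_ab=0.11, p_a=0.30, p_b=0.12)
print(f"D' = {d_prime:.2f}, r^2 = {r2:.2f}")
```

A pattern like this, with high D′ and a lower r², is compatible with two SNPs tagging a single underlying association.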

To investigate the potential association of rs2735591 with any
transcriptional regulation activity, we searched Genevar expres- 
sion quantitative trait loci (eQTL) database (19) and found mod- 
erate evidence for the association of rs2735591 with BCL10 
expression level in HapMap3 CHB (Han Chinese in Beijing, 
China) (P = 0.071) and JPT (Japanese in Tokyo, Japan) (P = 
0.032) samples. Moderate evidence for eQTL effect was also 
observed for rs233100 in HapMap3 LWK (Luhya in Webuye, 
Kenya) (P = 0.030) and GIH (Gujarati Indian from Houston, 
Texas) (P = 0.047) samples (Supplementary Material, Table S5).
Furthermore, by annotating SNPs that are highly correlated
(r² > 0.8) with rs2735591 and rs233100 using the information
from the databases HaploReg (20) and RegulomeDB (21), we
discovered that rs2735592, a SNP highly correlated with
rs2735591 (r² = 1, D′ = 1), is located at the 3′ UTR of BCL10
and may affect the binding of FOXA1 and FOXA2 transcription
factors by damaging the TEF binding motif. Moreover,
similar to rs2735591, moderate evidence was observed for
rs2735592 to affect the expression of BCL10 in HapMap CHB
(P = 0.071) and JPT (P = 0.043) samples.

We further investigated and compared the expression of 
BCL10 between the leprosy lesions from 22 patients and the 
normal skins from 33 healthy controls. Significantly lower
expression of BCL10 was observed in the lesions of patients than
in the normal skin tissues (P = 3.83 × 10⁻⁵) (Fig. 2).



We also assessed the connections between BCL10 and the eight
previously identified susceptibility genes for leprosy, including
IL23R, RAB32, RIPK2, TNFSF15, CYLD, LACC1, CCDC122 and NOD2. BCL10
was among the top 10 genes most significantly connected with the
known leprosy loci (P = 6.07 × 10⁻⁴, Supplementary Material,
Table S6), connected mostly through TNFSF15, RIPK2, CYLD and NOD2
(Supplementary Material, Fig. S2). The most frequently connecting
terms include kappab, caspase, metabotropic, inflammatory and death.
NF-κB is known to be a major regulator of downstream inflammatory
responses to mycobacterial infection (22,23). This provides further
supporting evidence for the involvement of BCL10 in the development
of leprosy.

DISCUSSION 

Through a three-stage association study in three independent
case-control series, we have identified a novel association at
rs2735591 (P = 1.03 × 10⁻⁹, OR = 1.24), located 889 bp upstream
of BCL10. Further fine-mapping analysis of the surrounding region
of this SNP revealed a stronger, but correlated association at
rs233100, a noncoding polymorphism located 28 Kb from BCL10.
Besides BCL10, there are several other genes or transcripts within
the LD region of the novel association locus (Fig. 1). C1orf52 and
DDAH1 are located next to BCL10. C1orf52 is a hypothetical
protein-coding transcript without
functional annotation. DDAH1 belongs to the dimethylarginine 
dimethylaminohydrolase (DDAH) gene family and encodes an 
enzyme involved in nitric oxide generation by regulating cellular 
concentrations of methylarginine, which in turn inhibits nitric 
oxide synthase activity (24). 

Given that all the significant SNPs and the top SNP (rs233100)
within the locus are noncoding variants, it is reasonable to
hypothesize that the observed association at 1p22 might be due to
the effect of the causal variant on the transcriptional regulation
of BCL10 expression. This is supported by evidence from eQTL
databases that our associated SNP might affect BCL10 expression.
Moreover, gene regulatory databases support this notion by
providing evidence that the associated SNP is likely to affect the
binding of FOXA1 and FOXA2 transcription factors, which might in
turn affect the expression of BCL10. We also investigated the
expression of the BCL10 gene in skin tissues and found that the
level of BCL10 mRNA was downregulated in leprosy lesions compared
with the normal skin tissues. It has been shown in vivo that BCL10
deficiency increases susceptibility to bacterial infections (25).



Human Molecular Genetics, 2013, Vol. 22, No. 21 4433 



Table 2. Summary of the association results of rs2735591

SNP                 rs2735591
Chromosome          1
Position            85 517 060
Minor/Major allele  T/C
Test allele^a       T
Gene                BCL10

                                    MAF cases  MAF controls  P             OR (95% CI)       Q-test^c
GWAS 1220 samples (Stage 1)^b       0.33       0.29          1.66 × 10⁻²   1.24 (1.04-1.48)  NA
Validation stage 2                  0.36       0.30          2.18 × 10⁻⁶   1.32 (1.17-1.48)  NA
Validation stage 3                  0.32       0.29          4.58 × 10⁻³   1.17 (1.05-1.3)   NA
Combined validation stages 2 and 3  0.34       0.30          1.27 × 10⁻⁷   1.27 (1.14-1.33)  0.127
Combined stages 1-3                 0.34       0.29          1.03 × 10⁻⁹   1.24 (1.16-1.33)  0.954

MAF, minor allele frequency; P, P-value; OR, odds ratio; CI, confidence interval.
a Test allele: the allele that was used for estimating the OR.
b Data from the previously published GWAS (6).
c P-value from Cochran's Q-test of heterogeneity.



In addition, BCL10 was the eighth most significantly connected 
gene to our previously identified leprosy susceptibility genes 
based on the GRAIL analysis, indicating that there is a strong 
biological relationship between BCL10 and previously impli- 
cated susceptibility genes. These results suggest that BCL10 is 
likely the susceptibility gene of this novel association, although 
further fine mapping and functional investigations are needed to 
confirm the biological mechanism underlying the association of 
BCL10. 

The protein encoded by BCL10 contains a caspase recruitment domain
(CARD) and plays an important role in the activation of NF-κB, an
important regulator of the immune response against infection by
M. leprae (26,27). BCL10 regulates the activation of the NF-κB
signaling pathway by forming two complexes, CARMA1-Bcl10-MALT1 and
CARD9-Bcl10-MALT1, in lymphoid (L-CBM) and myeloid (M-CBM) cells,
respectively (28-30). As a regulator of innate immunity, the M-CBM
complex acts on NF-κB activation downstream of immunoreceptor
tyrosine-based activation motif (ITAM)- and hemITAM-coupled
receptors. M-CBM can also regulate the activation of
mitogen-activated protein kinases downstream of TLRs and the
intracellular bacterial sensor NOD2 through RIPK2 as a part of the
immune response to microbial infection (28). As a regulator of
adaptive immunity, L-CBM is crucial for the activation of NF-κB in
numerous immune receptor signaling pathways, including the T-cell
receptor (TCR) and B-cell receptor (BCR) signaling pathways. It has
been suggested that L-CBM might also be involved in TLR-mediated
signaling for NF-κB activation in B lymphocytes. Paul et al.
investigated TCR-to-NF-κB signaling and indicated that antigen
signaling through the TCR triggers both activation of NF-κB and the
selective autophagy of BCL10, inhibiting NF-κB activation. This
study demonstrates that selective autophagy of BCL10 is a
pathway-intrinsic homeostatic mechanism that modulates TCR signaling
to NF-κB in effector T cells and may protect T cells from adverse
consequences of unrestrained NF-κB activation (31). Our discovery of
BCL10 as a new susceptibility locus may suggest that adaptive
immunity, in addition to the NOD2-mediated innate immunity, also
plays an important role in leprosy.



We did not observe supporting evidence for the association of TLRs
with leprosy susceptibility in the Chinese population. In our
previous study (13), we analyzed the SNP rs5743618 (I602S) of TLR1
reported in Indian and African populations (17) and another SNP,
rs17616475 of TLR1, showing suggestive association in our extended
GWAS analysis, but did not observe any evidence of association in
further validation, despite the fact that our sample of 3301 cases
and 5299 controls should have sufficient power (99%) to detect the
association of rs5743618 with OR = 0.31 (assuming the same effect
size as in Indians) at P = 0.05. In the current study, a total of
1599 SNPs within the 30 candidate genes of the TLR family, including
25 SNPs within the TLR1 locus (chromosome 4: 38.4-38.6 Mb), were
investigated by in silico association analysis in 706 cases and 5581
controls. Of the 1599 SNPs, only two TLR SNPs showed suggestive
associations with P < 0.01, but neither was confirmed by further
validation study. Power analysis indicates that the sample of 706
cases and 5581 controls should provide sufficient power (>85%) to
detect associations in SNPs with a moderate effect (OR = 1.50) and
frequencies as low as 5% at a significance threshold of 0.01. Taken
together, our studies did not reveal any evidence for the
associations of TLRs in the Chinese population. The absence of
association evidence for TLRs in the Chinese population is
intriguing, given that the TLRs are known to be critical mediators
of innate immune recognition of microbial pathogens, including
M. leprae (32), and the association of TLR1 with leprosy was
replicated in Indian and African populations (17). Although the LD
structure of the TLR1 region is similar between the Chinese (CHB)
and Indian (GIH) samples of HapMap3 data (data not shown), we could
not rule out the possibility that there could still be differences
in allele frequency or LD at some SNPs, and consequently the
associations identified in the Indian population may not be well
captured in our study. Further studies are needed to understand
whether the disparity of association evidence for TLRs between
Chinese and Indian/African populations may suggest genetic
heterogeneity of leprosy susceptibility between ethnic populations.

In summary, we have performed a comprehensive candidate gene
association study of TLR and CARD by deep in silico association
analysis in our previously published GWAS datasets and further
validation analysis in two large independent sample series of
leprosy. By integrating the hypothesis-driven study with genome-wide
analysis, we are able to identify novel associations beyond the
original GWAS analysis. Our discovery of the novel association at
1p22 implicates BCL10 as the new susceptibility gene for leprosy,
highlighting the important role of both innate and adaptive immune
responses in leprosy.

Figure 1. Regional association plots of 1p22 based on the imputation
analysis in the expanded GWAS dataset of 706 cases and 5581 controls
(13) (See Methods). The P-values for the SNPs (shown as −log₁₀
P-values on the y-axis) were plotted against their mapping positions
(x-axis). Imputed SNPs are denoted as circles, whereas typed SNPs as
squares. The color of each SNP reflects its r² value with the
confirmed SNP rs2735591. Estimated recombination rates [based on the
1000 Genomes CHB (Han Chinese in Beijing, China) and JPT (Japanese in
Tokyo, Japan)] were plotted in light blue. Plots were generated using
LocusZoom (39). The top SNP (rs233100) as well as the confirmed SNP
(rs2735591) within the LD block of 300 Kb in the 1p22.3 region are
labeled.



MATERIALS AND METHODS

Ethical statement

The study was approved by the institutional review boards of the
Shandong Provincial Institute of Dermatology and Venereology,
Shandong Academy of Medical Science. All the cases and the controls
were recruited with written informed consent.



Subjects

Stage 1 discovery analysis was performed in silico using our
previously published GWAS datasets (6,13). The validation analysis
of Stage 2 was performed in an independent series of 1504 cases and
1502 controls, and further validation analysis of Stage 3 was
performed in the second independent series of 938 cases and 5827
controls. All the patients and controls of the three stages were
recruited from northern China and matched in terms of age, gender
and residence, which minimizes the possibility of confounding due to
population stratification. Geographic matching has been demonstrated
to be a good proxy for genetic matching, given the high correlation
between the genetic and geographic structures of Han Chinese
subpopulations (33).

Table 1 shows the clinical information of all the samples. Leprosy
was diagnosed on the basis of consensus by at least two
dermatologists. The clinical diagnoses of all the leprosy cases were
based on medical records stored in local leprosy control
institutions and clinical assessments at the time of blood
collection (looking for evidence of leprosy such as claw hand,
lagophthalmos or foot drop and pathology of skin lesions, etc.). The
controls were recruited without history of leprosy or a family
history of leprosy or other autoimmune diseases.



Candidate gene and SNP selection 

By searching in the UCSC Genome Browser (http://genome.ucsc.edu/)
using the keyword 'TLR' or 'CARD', we found 37 candidate
genes of the TLR family and 53 candidate genes of the CARD
family. After removing overlapping transcripts, a total of 77 can- 
didate genes of TLR (30 genes) and CARD (47 genes) were 
obtained. We then determined the critical region of each candi- 
date gene (coding region + 20 Kb) and extracted all the associ- 
ation results within these critical regions from our previously 
published GWAS analyses in genetically matched 1220 
samples (706 cases and 514 controls) (6) and the expanded 
GWAS sample of 706 cases and 5581 population controls 
(13). There were a total of 4363 SNPs within 77 candidate 
genes extracted from the GWAS dataset, 2764 SNPs from 
CARD-related genes and 1599 SNPs from TLR-related genes. 
SNPs with suggestive association evidence were identified by 
using the following criteria: (i) the P-value of association was 
<0.01 in the expanded GWAS analysis; (ii) the P-value of
association showed improvement between the initial GWAS analysis
of 706 cases and 514 matched controls and the expanded GWAS
analysis of 706 cases and 5581 population controls and (iii) SNPs
located in known susceptibility loci, NOD2 and RIPK2, were
excluded. Finally, eight suggestive SNPs within two TLR
genes (ZCCHC11, FREM1) and six CARD genes (BCL10,
KCNQ1, MAP2K1, CARD14, NOD1 and VISA) were brought
forward for further validation analysis. 
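
The three selection criteria above amount to a simple filter over per-SNP association results. The sketch below is one way to express it, under the assumption that "showed improvement" means a smaller P-value in the expanded analysis than in the initial one. The record type, field names and example values are hypothetical and are not taken from the study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnpResult:
    rsid: str
    p_initial: float      # P-value in the matched GWAS (706 cases, 514 controls)
    p_expanded: float     # P-value in the expanded GWAS (706 cases, 5581 controls)
    in_known_locus: bool  # True if the SNP lies in a known locus (e.g. NOD2, RIPK2)


def is_suggestive(snp: SnpResult, threshold: float = 0.01) -> bool:
    """Apply criteria (i)-(iii); 'improvement' is assumed to mean a smaller P."""
    return (
        snp.p_expanded < threshold            # (i) suggestive in expanded analysis
        and snp.p_expanded < snp.p_initial    # (ii) P-value improved
        and not snp.in_known_locus            # (iii) not in a known locus
    )


def select_suggestive(results):
    """Return the rsids of all SNPs that pass the three criteria."""
    return [snp.rsid for snp in results if is_suggestive(snp)]


# Hypothetical records, for illustration only (not study data).
example = [
    SnpResult("snp_a", p_initial=0.04, p_expanded=0.002, in_known_locus=False),
    SnpResult("snp_b", p_initial=0.001, p_expanded=0.008, in_known_locus=False),
    SnpResult("snp_c", p_initial=0.03, p_expanded=0.0005, in_known_locus=True),
    SnpResult("snp_d", p_initial=0.20, p_expanded=0.05, in_known_locus=False),
]
print(select_suggestive(example))  # ['snp_a']
```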
Figure 2. Expression analysis of BCL10 in skin tissues. A significantly
lower expression of BCL10 was observed in the lesions of patients than
the normal skin tissues of healthy controls (P = 3.83 × 10⁻⁵).



Gene-set enrichment analysis of leprosy genetic associations 

MAGENTA (18) was employed to test for the enrichment of 
genetic associations within pathways by using the genome-wide 
association results from our published GWAS dataset (13) as 
input. This dataset includes 1 701 673 genotyped and imputed 
SNPs. 

Evidence of significance was taken from the values presented in the
'FDR_95PERC_CUTOFF' column of MAGENTA's output. By using a less
stringent level of 20% FDR (False Discovery Rate), we observed four
pathways (Supplementary Material, Table S1), containing 141 unique
genes, which is 0.7% of all the known genes in the human genome.
Among these 141 unique genes, 2 are TLR/CARD genes, which comprise
2.5% of the total TLR/CARD genes within the genome, but Pearson's
chi-square test with Yates' continuity correction (as implemented in
the R statistical package under the function chisq.test) did not
reach statistical significance and only showed suggestive evidence
of enrichment (one-sided P = 0.0995). We further took 10 000 random
gene sets comprising the same number of genes as the actual set at a
level of 20% FDR (141 genes). Among these 10 000 random sets, there
were 884 sets containing two or more TLR/CARD genes, giving us a
P-value of 0.0884. As this exceeds the conventional 5% cut-off for
the type-I error rate, this indicates that there is no significant
evidence of enrichment for TLR/CARD genes.
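
The random gene-set procedure described above can be sketched as follows. This is not the authors' script: the genome size of 20 000 genes is an assumption inferred from the statement that 141 genes are 0.7% of all known genes, the count of 77 candidate genes is taken from the candidate gene list, and genes are represented by integer indices. An exact hypergeometric tail is included as a cross-check on the simulation.

```python
import random
from math import comb


def empirical_enrichment_p(n_genes, n_candidates, set_size, observed,
                           n_sets=10_000, seed=0):
    """Fraction of random gene sets holding at least `observed` candidate genes."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sets):
        drawn = rng.sample(range(n_genes), set_size)
        # Genes 0 .. n_candidates-1 stand in for the TLR/CARD candidates.
        n_found = sum(1 for g in drawn if g < n_candidates)
        if n_found >= observed:
            hits += 1
    return hits / n_sets


def hypergeom_tail(n_genes, n_candidates, set_size, observed):
    """Exact P(X >= observed) when drawing set_size genes without replacement."""
    total = comb(n_genes, set_size)
    upper = min(set_size, n_candidates)
    favourable = sum(
        comb(n_candidates, k) * comb(n_genes - n_candidates, set_size - k)
        for k in range(observed, upper + 1)
    )
    return favourable / total


# Assumed inputs: ~20 000 genes genome-wide, 77 TLR/CARD genes, 141 genes drawn.
p_sim = empirical_enrichment_p(20_000, 77, 141, observed=2)
p_exact = hypergeom_tail(20_000, 77, 141, observed=2)
print(f"simulated P = {p_sim:.3f}, exact P = {p_exact:.3f}")
```

Because the gene universe here is assumed, the output is not expected to reproduce the reported 0.0884 exactly, only to land in the same non-significant range.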



Genotyping analysis 

Genotyping analyses of all the validation samples were con- 
ducted by using the Sequenom MassArray system. Approxi- 
mately 15 ng of genomic DNA was used to genotype each 
sample using the Sequenom MassArray system (San Diego, 
USA). The sample DNA was amplified by a multiplex PCR, 
and the PCR products were then used for a locus-specific single- 
base extension reaction. The resulting products were desalted 
and transferred to a 384-element SpectroCHIP array. Allele de- 
tection was performed using MALDI-TOF MS. The mass spec- 
trograms were analyzed using the Sequenom MassARRAY 
TYPER software (San Diego, USA). In each validation series, 
we excluded SNPs with a call rate of <95%, low minor allele
frequency (<0.01) or deviation from Hardy-Weinberg
equilibrium proportions (P < 0.01) in the controls. Out of
eight suggestive SNPs validated in Stage 2, one SNP was
unsuccessfully designed by Sequenom and the other one had
poor clustering (Supplementary Material, Fig. S3), hence
could not be analyzed (rs6084506 and rs5743369, respectively;
both SNPs are in CARD genes).
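
The per-SNP quality-control thresholds above can be written as a small filter. In the sketch below the Hardy-Weinberg check uses a 1-degree-of-freedom chi-square goodness-of-fit test, which is an assumption on our part because the text does not say whether an exact or asymptotic test was used. The function names and the genotype counts are hypothetical.

```python
from math import erfc, sqrt


def hwe_chi2_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """P-value of a 1-df chi-square test for Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    if min(expected) == 0:
        return 1.0  # monomorphic SNP: no deviation can be tested
    chi2 = sum((o - e) ** 2 / e for o, e in zip((n_aa, n_ab, n_bb), expected))
    # For one degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    return erfc(sqrt(chi2 / 2))


def passes_qc(call_rate, maf, control_genotypes,
              min_call_rate=0.95, min_maf=0.01, hwe_alpha=0.01):
    """Keep a SNP only if it clears all three thresholds quoted in the text."""
    return (
        call_rate >= min_call_rate
        and maf >= min_maf
        and hwe_chi2_p(*control_genotypes) >= hwe_alpha
    )


# Hypothetical genotype counts in controls (AA, AB, BB), not study data.
print(passes_qc(0.99, 0.30, (298, 489, 213)))  # close to HWE proportions
print(passes_qc(0.99, 0.50, (500, 0, 500)))    # no heterozygotes at all
```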

Imputation of ungenotyped SNPs within 1p22 locus in the expanded GWAS dataset

Imputation of ungenotyped SNPs within the 1p22 locus was
performed in the GWAS dataset (13) of 706 cases and 5581 controls
by using IMPUTE version 2.2.2 and the genetic variation data
from the 1000 Genomes Project (version February 2012) as a
reference panel, which includes 1092 individuals from Africa (246
samples), North America (181 samples), Asia (286 samples) and 
Europe (379 samples). Imputed genotypes with a probability of 
<90%, as well as SNPs with imputation certainty < 80%, minor 
allele frequency < 1% and a missing rate of > 1% were excluded 
from further analysis. In total, there were 1500 SNPs that were 
successfully imputed within this region. Together with 296
genotyped SNPs, a total of 1796 SNPs within 1p22 were tested for
association in the 706 cases and 5581 controls. 



Statistical analysis 

The Cochran-Armitage trend test was used to analyze the
genotype-phenotype association in each of the two validation samples
using PLINK v1.07 software (34). The Cochran-Mantel-Haenszel test
was used to test genotype-phenotype association in the combined GWAS
and validation samples by treating the GWAS (1220 samples) and two
validation samples as independent studies. The expanded GWAS dataset
described in our previous study (13) (706 cases and 5581 controls)
was only used in the initial in silico association analysis to
select SNPs for further validation analysis. The final association
analysis of the combined samples was performed by using the matched
GWAS dataset of 706 cases and 514 controls and the two independent
validation series, consisting of a total of 3148 cases and 7843
controls (this number excludes the extended samples used as
controls, for reasons previously explained (13)). Cochran's Q test
was performed to evaluate the significance of heterogeneity among
the three studies, and a P-value of <0.05 after correction for
multiple SNP testing was considered as significant heterogeneity. If
the corrected P-value was >0.05, the fixed-effect model
(Mantel-Haenszel) was used to combine the results of the three
independent samples (35); otherwise, the random-effects model
(DerSimonian-Laird) was used (36). Multiple SNP testing correction
was performed using Bonferroni correction (37).
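
The decision rule described above (test heterogeneity with Cochran's Q, then pool with a fixed-effect or a DerSimonian-Laird random-effects model) can be sketched as below. Two simplifications are assumed: inverse-variance weighting on the log odds ratio is used in place of the Mantel-Haenszel estimator, and standard errors are back-calculated from the rounded ORs and 95% confidence intervals in Table 2. The heterogeneity statistics it prints are therefore not expected to reproduce the published Q-test values. All function names are ours.

```python
from math import erfc, exp, log, pi, sqrt


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    if df % 2 == 0:  # even df: finite Poisson-type sum
        term = total = 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return exp(-x / 2) * total
    term, total = sqrt(x), 0.0  # odd df: normal tail plus a finite sum
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return erfc(sqrt(x / 2)) + sqrt(2 / pi) * exp(-x / 2) * total


def log_or_and_se(odds_ratio, ci_low, ci_high):
    """Log OR and its standard error recovered from a 95% confidence interval."""
    return log(odds_ratio), (log(ci_high) - log(ci_low)) / (2 * 1.96)


def combine(studies, alpha=0.05):
    """studies: list of (log_or, se). Fixed effect unless Q is significant."""
    if len(studies) < 2:
        raise ValueError("need at least two studies")
    weights = [1 / se ** 2 for _, se in studies]
    sw = sum(weights)
    fixed = sum(w * b for w, (b, _) in zip(weights, studies)) / sw
    q = sum(w * (b - fixed) ** 2 for w, (b, _) in zip(weights, studies))
    df = len(studies) - 1
    p_het = chi2_sf(q, df)
    if p_het > alpha:
        return {"model": "fixed", "log_or": fixed, "se": sqrt(1 / sw),
                "q": q, "p_het": p_het}
    # DerSimonian-Laird estimate of the between-study variance.
    tau2 = max(0.0, (q - df) / (sw - sum(w * w for w in weights) / sw))
    re_w = [1 / (se ** 2 + tau2) for _, se in studies]
    pooled = sum(w * b for w, (b, _) in zip(re_w, studies)) / sum(re_w)
    return {"model": "random", "log_or": pooled, "se": sqrt(1 / sum(re_w)),
            "q": q, "p_het": p_het}


# ORs and 95% CIs for rs2735591 as printed in Table 2 (stages 1, 2 and 3).
stages = [log_or_and_se(1.24, 1.04, 1.48),
          log_or_and_se(1.32, 1.17, 1.48),
          log_or_and_se(1.17, 1.05, 1.30)]
result = combine(stages)
print(result["model"], round(exp(result["log_or"]), 2))
```

With these rounded inputs the heterogeneity test is non-significant, so the fixed-effect branch is taken, and the pooled OR is close to the combined estimate reported in Table 2.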



Gene relationship investigation with GRAIL 

The web-based software called Gene Relationships Across 
Implicated Loci (GRAIL) (38) was used to investigate the bio- 
logical relationship between the eight known susceptibility 
genes (including IL23R, RAB32, RIPK2, TNFSF15, CYLD,
CCDC122, LACC1 and NOD2) (6,13) and 30 and 47 candidate 
genes in the TLR and CARD gene family, respectively. GRAIL 
provides a relatedness score of two genes by measuring the text- 
based similarity based on the text in PubMed abstracts. Related- 
ness of the genes tested is not biased with respect to phenotype as 
GRAIL does not take any information on the phenotype used.
We selected the December 2006 PubMed data as a basis for 
text mining and CHB and JPT HapMap population as a basis 
for LD calculations in GRAIL. 

Multiplex-branched DNA assay 

FFPE skin tissue samples were obtained from the Shandong 
Provincial Institute of Dermatology and Venereology 
between 1989 and 2007 (33 cases and 34 controls), with the 
cases being collected generally earlier (older batch) than the 
controls. Samples were all of Northern Han Chinese descent 
with the mean age in the cases being 46.5 and the mean age 
in the controls being 42.6 (Student's t-test P-value = 0.33).
Modified probe design software was used to design oligo- 
nucleotide probe sets for BCL10. The bDNA assays were per- 
formed according to the recommended procedure of 
QuantiGene Reagent System. Relative levels of gene expres- 
sion were normalized to the expression of PPIB and HPRT1 
to control for variability in the preparation of the tissue 
samples as well as input material. Eleven cases and one 
control had low fluorescence intensity and, therefore, were 
not included in the analysis. Hence, the remaining samples to 
be analyzed include 22 cases and 33 controls. Since the expres- 
sion data did not follow a normal distribution, statistical signifi- 
cance was tested using a non-parametric test, Wilcoxon's rank 
sum test implemented in R package 'exactRankTests'. Signifi- 
cance was defined at P < 0.01. 
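
For readers without access to R, the rank-based comparison can be approximated in a few lines. The sketch below implements the two-sided Wilcoxon rank-sum (Mann-Whitney) test with a normal approximation and tie correction. It is a stand-in for, not a reimplementation of, the exact test in 'exactRankTests', so small-sample P-values will differ somewhat. The function name and example values are hypothetical.

```python
from math import erfc, sqrt


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected).

    Returns (U statistic for x, two-sided P-value).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of ranks i+1 .. j
        t = j - i                  # size of this group of tied values
        tie_term += t ** 3 - t
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0  # all values identical: no evidence of a shift
    z = (u - mean_u) / sqrt(var_u)
    return u, erfc(abs(z) / sqrt(2))


# Hypothetical normalized expression values (not study data).
lesions = [0.8, 1.1, 0.9, 1.0, 0.7]
normal_skin = [1.6, 1.9, 1.4, 2.1, 1.7]
u_stat, p_value = rank_sum_test(lesions, normal_skin)
print(u_stat, p_value)
```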



URLS 

Genevar, http://www.sanger.ac.uk/resources/software/genevar/java/genevar.jnlp.

RegulomeDB, http://regulome.stanford.edu/.

HaploReg, http://www.broadinstitute.org/mammals/haploreg/haploreg.php.

MAGENTA, http://www.broadinstitute.org/mpg/magenta/.

GRAIL, http://www.broadinstitute.org/mpg/grail/.

SUPPLEMENTARY MATERIAL 

Supplementary Material is available at HMG online. 